STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation
Abstract
Video Instance Segmentation (VIS) is a task that simultaneously requires classification, segmentation, and instance association in a video. Recent VIS approaches rely on sophisticated pipelines to achieve this goal, including RoI-related operations or 3D convolutions. In contrast, we present a simple and efficient single-stage framework based on the segmentation method CondInst by adding an extra tracking head. To improve accuracy, a novel bi-directional spatio-temporal contrastive learning strategy for embedding across frames is proposed. Moreover, an instance-wise temporal consistency scheme is utilized to produce temporally coherent results. Experiments conducted on the YouTube-VIS-2019, YouTube-VIS-2021, and OVIS-2021 datasets validate the effectiveness and efficiency of the proposed method. We hope it can serve as a strong baseline for other instance-level video tasks.
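As a rough illustration of what a bi-directional contrastive objective over per-instance embeddings from two frames might look like, here is a minimal sketch in plain Python. It is a generic symmetric InfoNCE loss, not the formulation from the paper: the function names, the temperature value, and the assumption that row i in one frame corresponds to row i in the other are all hypothetical choices made for this example.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def _info_nce(queries, keys, temperature):
    """Mean cross-entropy in which queries[i] is expected to match keys[i]."""
    total = 0.0
    for i, query in enumerate(queries):
        # Cosine similarity of this query to every key, scaled by temperature.
        logits = [sum(a * b for a, b in zip(query, key)) / temperature
                  for key in keys]
        # Numerically stable log-sum-exp.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)


def bidirectional_contrastive_loss(emb_a, emb_b, temperature=0.1):
    """Symmetric InfoNCE between two frames (hypothetical helper).

    emb_a[i] and emb_b[i] are assumed to be embeddings of the same
    instance in two different frames; all other rows act as negatives.
    The loss is averaged over both matching directions (a->b and b->a).
    """
    a = [_normalize(v) for v in emb_a]
    b = [_normalize(v) for v in emb_b]
    return 0.5 * (_info_nce(a, b, temperature) + _info_nce(b, a, temperature))


# Two instances whose embeddings drift only slightly between frames.
frame_t = [[1.0, 0.0], [0.0, 1.0]]
frame_t1 = [[0.9, 0.1], [0.1, 0.9]]
loss_matched = bidirectional_contrastive_loss(frame_t, frame_t1)
# Swapping the instance order in the second frame breaks the correspondence.
loss_swapped = bidirectional_contrastive_loss(frame_t, frame_t1[::-1])
```

With correctly paired instances the loss is close to zero, while the swapped pairing gives a much larger value. Averaging the two directions keeps the objective symmetric in the two frames.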
Similar resources
Low-Rank Spatio-Temporal Video Segmentation
Robust Principal Component Analysis (RPCA) has generated a great amount of interest for background/foreground estimation in videos. The central hypothesis in this setting is that a video’s background can be well-represented by a low-rank model. However, in the presence of complex lighting conditions this model is only accurate in localised spatio-temporal regions. Following this observation, we...
Spatio-Temporal Segmentation of Video Data
Image segmentation provides a powerful semantic description of video imagery essential in image understanding and efficient manipulation of image data. In particular, segmentation based on image motion defines regions undergoing similar motion, allowing an image coding system to more efficiently represent video sequences. This paper describes a general iterative framework for segmentation of video ...
Automatic Spatio-Temporal Video Sequence Segmentation
In the paper, an automatic spatio-temporal video sequence segmentation algorithm is proposed. To address this very difficult computer vision problem, several novel algorithms have been developed which use both spatial and temporal information. First, a novel temporal segmentation algorithm is developed based on our previous work in motion estimation. Second, an iterative split-and-merge spatial s...
Video region segmentation by spatio-temporal watersheds
We propose a video region segmentation scheme combining spatio-temporal edges and watershed techniques. We consider the video sequence as a 3-D volume and compute color edges within this volume. These color edges form a vector field that is in turn used to obtain an edge function. This edge function is used as a topological surface for a watershed grouping stage. Considering the video as a 3-D ...
Adversarial Spatio-Temporal Learning for Video Deblurring
Camera shake or target movement often leads to undesired blur effects in videos captured by a hand-held camera. Despite significant efforts being devoted to video-deblur research, two major challenges remain: 1) how to model the spatiotemporal characteristics across both the spatial domain (i.e. image plane) and temporal domain (i.e. neighboring frames), and 2) how to restore sharp image detail...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2023
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-031-25069-9_35